Automated Cross-lingual Link Discovery in Wikipedia

Authors

  • Ling-Xiang Tang
  • Daniel Cavanagh
  • Andrew Trotman
  • Shlomo Geva
  • Yue Xu
  • Laurianne Sitbon
Abstract

At NTCIR-9, we participated in the cross-lingual link discovery (Crosslink) task. In this paper we describe our approaches to discovering Chinese, Japanese, and Korean (CJK) cross-lingual links for English documents in Wikipedia. Our experimental results show that a link mining approach, which mines the existing link structure for anchor probabilities and "translates" anchor targets using cross-lingual document name triangulation, performs very well. The evaluation shows encouraging results for our system.
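The abstract does not spell out how anchor probabilities or document name triangulation are computed, so the following is only a minimal sketch under stated assumptions, not the authors' implementation. It assumes an anchor probability is the fraction of a phrase's occurrences that are hyperlinked, and that triangulation means following an English anchor to its English target title and then to that article's title in the target language through an existing language link. All function names, the threshold, and the toy data are hypothetical.

```python
def anchor_probabilities(anchor_counts, text_counts):
    """Assumed definition: times a phrase is used as a link anchor
    divided by the times it appears in the collection at all."""
    return {
        phrase: anchor_counts[phrase] / text_counts[phrase]
        for phrase in anchor_counts
        if text_counts.get(phrase, 0) > 0
    }


def triangulate(anchor_targets, language_links, phrase, lang):
    """English anchor -> English article title -> title in `lang`.
    Returns None when either hop is missing."""
    en_title = anchor_targets.get(phrase)
    if en_title is None:
        return None
    return language_links.get(en_title, {}).get(lang)


def recommend_links(text, anchor_counts, text_counts,
                    anchor_targets, language_links, lang, threshold=0.1):
    """Rank phrases found in `text` by anchor probability and keep
    those that can be mapped to a target-language article."""
    probs = anchor_probabilities(anchor_counts, text_counts)
    results = []
    for phrase, p in sorted(probs.items(), key=lambda kv: -kv[1]):
        if p < threshold or phrase not in text:
            continue
        target = triangulate(anchor_targets, language_links, phrase, lang)
        if target is not None:
            results.append((phrase, target, p))
    return results


# Hypothetical toy statistics, not taken from the paper.
anchor_counts = {"Tokyo": 8, "city": 1}
text_counts = {"Tokyo": 10, "city": 50}
anchor_targets = {"Tokyo": "Tokyo", "city": "City"}
language_links = {"Tokyo": {"ja": "東京都", "zh": "东京"}}

links = recommend_links("Tokyo is a large city.", anchor_counts, text_counts,
                        anchor_targets, language_links, lang="ja")
print(links)
```

In this toy run, "city" is dropped because its assumed anchor probability (1/50) falls below the hypothetical threshold, while "Tokyo" is kept and mapped to its Japanese title through the language link.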


Related resources

Cross-Lingual Knowledge Discovery: Chinese-to-English Article Linking in Wikipedia

In this paper we examine automated Chinese-to-English link discovery in Wikipedia and the effects of Chinese segmentation and Chinese-to-English translation on the hyperlink recommendation. Our experimental results show that the implemented link discovery framework can effectively recommend Chinese-to-English cross-lingual links. The techniques described here can assist bi-lingual users where a ...


An evaluation framework for cross-lingual link discovery

Cross-Lingual Link Discovery (CLLD) is a new problem in Information Retrieval. The aim is to automatically identify meaningful and relevant hypertext links between documents in different languages. This is particularly helpful in knowledge discovery if a multi-lingual knowledge base is sparse in one language or another, or the topical coverage in each language is different; such is the case wit...


NTCIR-10 CrossLink-2 Task: A Link Mining Strategy

At NTCIR-10, we participated in the cross-lingual link discovery (CrossLink-2) task. In this paper we describe our systems for discovering cross-lingual links between the Chinese, Japanese, and Korean (CJK) Wikipedia and the English Wikipedia. The evaluation results show that our implementation of the cross-lingual linking method achieved promising results.


Using Explicit Semantic Analysis for Cross-Lingual Link Discovery

This paper explores how to automatically generate cross-language links between resources in large document collections. The paper presents new methods for Cross-Lingual Link Discovery (CLLD) based on Explicit Semantic Analysis (ESA). The methods are applicable to any multilingual document collection. In this report, we present their comparative study on the Wikipedia corpus and provide new insi...


Cross-Lingual Link Discovery between Chinese and English Wiki Knowledge Bases

Wikipedia is an online multilingual encyclopedia that contains a very large number of articles covering most written languages. However, one critical issue for Wikipedia is that the pages in different languages are rarely linked except for the cross-lingual link between pages about the same subject. This could pose serious difficulties to humans and machines who try to seek information from dif...



Publication date: 2011